AITopics | sparse solution

Parametric Simplex Method for Sparse Learning

Neural Information Processing SystemsMar-17-2026, 18:55:39 GMT

High dimensional sparse learning has imposed a great computational challenge to large scale data analysis. In this paper, we investiage a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. Particularly, we demonstrate the superiority of PSM over various sparse learning approaches, including Dantzig selector for sparse linear regression, sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions such that its computational performance can be significantly boosted. Thorough numerical experiments are provided to demonstrate the outstanding performance of the PSM method.

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)

Add feedback

Exact Recovery of Hard Thresholding Pursuit

Neural Information Processing SystemsMar-17-2026, 11:35:36 GMT

The Hard Thresholding Pursuit (HTP) is a class of truncated gradient descent methods for finding sparse solutions of $\ell_0$-constrained loss minimization problems. The HTP-style methods have been shown to have strong approximation guarantee and impressive numerical performance in high dimensional statistical learning applications. However, the current theoretical treatment of these methods has traditionally been restricted to the analysis of parameter estimation consistency. It remains an open problem to analyze the support recovery performance (a.k.a., sparsistency) of this type of methods for recovering the global minimizer of the original NP-hard problem. In this paper, we bridge this gap by showing, for the first time, that exact recovery of the global sparse minimizer is possible for HTP-style methods under restricted strong condition number bounding conditions. We further show that HTP-style methods are able to recover the support of certain relaxed sparse solutions without assuming bounded restricted strong condition number. Numerical results on simulated data confirms our theoretical predictions.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)

Add feedback

5e34a2b4c23f4de585fb09a7f546f527-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 22:26:44 GMT

attraction, equilibria, steady state, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

Parametric Simplex Method for Sparse Learning

Neural Information Processing SystemsNov-21-2025, 16:17:59 GMT

High dimensional sparse learning has imposed a great computational challenge to large scale data analysis. In this paper, we investiage a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. Particularly, we demonstrate the superiority of PSM over various sparse learning approaches, including Dantzig selector for sparse linear regression, sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions such that its computational performance can be significantly boosted. Thorough numerical experiments are provided to demonstrate the outstanding performance of the PSM method.

name change, parametric simplex method, sparse learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)

Add feedback

Exact Recovery of Hard Thresholding Pursuit

Neural Information Processing SystemsNov-21-2025, 15:32:55 GMT

The Hard Thresholding Pursuit (HTP) is a class of truncated gradient descent methods for finding sparse solutions of $\ell_0$-constrained loss minimization problems. The HTP-style methods have been shown to have strong approximation guarantee and impressive numerical performance in high dimensional statistical learning applications. However, the current theoretical treatment of these methods has traditionally been restricted to the analysis of parameter estimation consistency. It remains an open problem to analyze the support recovery performance (a.k.a., sparsistency) of this type of methods for recovering the global minimizer of the original NP-hard problem. In this paper, we bridge this gap by showing, for the first time, that exact recovery of the global sparse minimizer is possible for HTP-style methods under restricted strong condition number bounding conditions. We further show that HTP-style methods are able to recover the support of certain relaxed sparse solutions without assuming bounded restricted strong condition number. Numerical results on simulated data confirms our theoretical predictions.

exact recovery, hard thresholding pursuit, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)

Add feedback

From Bayesian Sparsity to Gated Recurrent Nets

Hao He, Bo Xin, Satoshi Ikehata, David Wipf

Neural Information Processing SystemsNov-21-2025, 13:31:28 GMT

However, the Achilles' heel of all these approaches is that

artificial intelligence, iteration, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Coordinate Descent for Network Linearization

Rakhlin, Vlad, Jevnisek, Amir, Avidan, Shai

arXiv.org Machine LearningNov-18-2025

ReLU activations are the main bottleneck in Private Inference that is based on ResNet networks. This is because they incur significant inference latency. Reducing ReLU count is a discrete optimization problem, and there are two common ways to approach it. Most current state-of-the-art methods are based on a smooth approximation that jointly optimizes network accuracy and ReLU budget at once. However, the last hard thresholding step of the optimization usually introduces a large performance loss. We take an alternative approach that works directly in the discrete domain by leveraging Coordinate Descent as our optimization framework. In contrast to previous methods, this yields a sparse solution by design. We demonstrate, through extensive experiments, that our method is State of the Art on common benchmarks.

artificial intelligence, budget, optimization problem, (15 more...)

arXiv.org Machine Learning

2511.11781

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Checklist 1. For all authors (a)

Neural Information Processing SystemsAug-17-2025, 06:53:26 GMT

Do the main claims made in the abstract and introduction accurately reflect the paper's If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Y es] (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? Did you include the total amount of compute and the type of resources used (e.g., type Did you include any new assets either in the supplemental material or as a URL? [N/A] Did you discuss whether and how consent was obtained from people whose data you're If you used crowdsourcing or conducted research with human subjects... (a) The full version of the Table 1 is given in Table 3. That is, the following relationships hold: 2g p uq " sup This formulation can be found in Lemma 3.1 of Jenatton et al. First we compute the gradient g p uq " ÿ A.7 Log Sum First, we compute the derivative g puq " log p? u ` ε q ùñ g " 0, (46) which gives the inverse mapping? However, it is separable, and in one dimension we have g p uq " null tu ą 0u . " au ě m uq, (69) where ConvpAq is the convex hull of the set A. Similarly define s S Running for 1000 epochs, for example, gets the fraction of nonzeros down to around 0.1, at a slight expense of accuracy.

artificial intelligence, machine learning, penalty, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Add feedback